Abstract.

We derive a sharp, non-asymptotic, non-uniform error estimate for the Bernstein
approximation of a continuous function, based on modern probabilistic apparatus.

We also investigate the convergence of the derivatives of these polynomials and
briefly consider the multivariate case.
1 Introduction. Notations. Statement of problem.

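Let $f = f(x)$, $x \in [0,1]$, be a continuous numerical function, $f \in C[0,1]$,
let $n = 2, 3, \ldots$, and let

$$
B_n[f](x) = \sum_{k=0}^{n} \binom{n}{k} f\left(\frac{k}{n}\right) x^k (1-x)^{n-k} \tag{1.0}
$$

be its Bernstein polynomial of degree $n$.

It is reasonable to define formally

$$
B_{\infty}[f](x) = f(x), \quad f(\cdot) \in C[0,1]. \tag{1.0a}
$$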

Denote also, as usual, by

$$
\omega[f](\delta) = \sup_{x, y \in [0,1],\ |x-y| \le \delta} |f(x) - f(y)| \tag{1.1}
$$

the modulus of continuity of the first order of the function $f = f(x)$, and let

$$
\Delta_n[f](x) = | B_n[f](x) - f(x) | \tag{1.2}
$$

be the non-uniform error of the Bernstein approximation of order $n$ for the
function $f$.
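
The two definitions above can be evaluated directly. The following is a minimal
sketch, not part of the original text; the function names `bernstein` and
`bernstein_error` are hypothetical. It computes $B_n[f](x)$ and the non-uniform
error $\Delta_n[f](x)$, and uses the classical identity
$B_n[t^2](x) = x^2 + x(1-x)/n$ as a sanity check.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the Bernstein polynomial B_n[f](x) of degree n at x in [0, 1]."""
    return sum(comb(n, k) * f(k / n) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def bernstein_error(f, n, x):
    """Non-uniform error Delta_n[f](x) = |B_n[f](x) - f(x)|."""
    return abs(bernstein(f, n, x) - f(x))

# For f(t) = t^2 the classical identity B_n[f](x) = x^2 + x(1 - x)/n holds,
# so the error at x equals x(1 - x)/n, which depends on the point x.
square = lambda t: t * t
err = bernstein_error(square, 10, 0.3)   # should equal 0.3 * 0.7 / 10
```

The error vanishes at the endpoints and is largest at $x = 1/2$, which is the
kind of dependence on $x$ that a non-uniform estimate is meant to capture.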

The problem of error estimation for the Bernstein approximation of a continuous
function goes back to the classical work of S.N.Bernstein [3] (1912). He
proved in that famous article that the sequence $B_n[f](x)$ of Bernstein
polynomials of degree $n$ converges uniformly, as $n \to \infty$, to the initial
function $f(x)$:

$$
\lim_{n \to \infty} \max_{x \in [0,1]} | B_n[f](x) - f(x) | = \lim_{n \to \infty} \max_{x \in [0,1]} \Delta_n[f](x) = 0.
$$

There is a huge number of publications on the estimation of the value
$\Delta_n[f](x)$, see for example [1], [4], [10], [11], [17], [18], [19], [23], [25], [30], [31], [32]
and many others; see also the references therein.

Our purpose in this short report is to obtain a universal estimate, sharp up to
a multiplicative constant, non-uniform and non-asymptotic, for the error of the
Bernstein approximation of an arbitrary continuous function in terms of its
modulus of continuity of the first order.

We also intend to use modern probabilistic apparatus: the theory of Grand Lebesgue
Spaces of random variables, in particular the theory of Strictly Subgaussian random
variables; we also introduce and use a generalization of this notion, the Almost
Strictly Subgaussian random variables.

Note that in the considered case the non-uniform estimates are "much" better
than the uniform ones, see [23], [24], [18].


2 Auxiliary apparatus: subgaussian and strictly 
subgaussian random variables. 


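Let $\{\Omega, B, P\}$ be some sufficiently rich probability space with expectation $E$.

Definition 2.1.

We say that the centered ($E\xi = 0$) numerical random variable (r.v.) $\xi$ is
subgaussian, or equivalently, belongs to the space $\mathrm{Sub}(\Omega)$, if there exists a
non-negative constant $\tau \ge 0$ such that

$$
\forall \lambda \in R \ \Rightarrow \ E \exp(\lambda \xi) \le \exp[\lambda^2 \tau^2/2]. \tag{2.1}
$$

The minimal value $\tau$ satisfying (2.1) for all the values $\lambda \in R$ is called the
subgaussian norm of the variable $\xi$; write

$$
\|\xi\|_{\mathrm{Sub}} = \inf\{\tau, \ \tau > 0: \ \forall \lambda \in R \ \Rightarrow \ E \exp(\lambda \xi) \le \exp(\lambda^2 \tau^2/2)\}.
$$

Evidently,

$$
\|\xi\|_{\mathrm{Sub}} = \sup_{\lambda \ne 0} \left[ \sqrt{2 \ln E \exp(\lambda \xi)} \, / \, |\lambda| \right]. \tag{2.2}
$$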

This important notion was introduced by J.P.Kahane [14]; V.V.Buldygin and
Yu.V.Kozachenko proved in [6] that the set $\mathrm{Sub}(\Omega)$ with the norm
$\|\cdot\|_{\mathrm{Sub}}$ is a complete Banach space which is isomorphic to the subspace
consisting only of the centered variables of the Orlicz space over $(\Omega, B, P)$
with the Orlicz-Young function $N(u) = \exp(u^2) - 1$; see also [16].

A detailed investigation of this class of random variables, with very interesting
applications to the theory of random fields, may be found in the book [8]; we
reproduce here some main facts from this monograph.

Definition 2.1. The tail function $T_{\eta}(u)$, $u \ge 0$, of a numerical valued random
variable $\eta$ is defined as usual by

$$
T_{\eta}(u) := \max[P(\eta > u), P(\eta < -u)], \quad u \ge 0.
$$

If for instance $\|\xi\|_{\mathrm{Sub}} = \tau \in (0, \infty)$, then

$$
T_{\xi}(u) = \max[P(\xi > u), P(\xi < -u)] \le \exp(-u^2/(2\tau^2)), \quad u \ge 0; \tag{2.3}
$$

and the last inequality is in the general case unimprovable. To see this, it is
sufficient to consider the case when the r.v. $\xi$ has a centered non-degenerate
Gaussian distribution.

Conversely, if $E\xi = 0$ and if for some positive finite constant $K$

$$
T_{\xi}(u) = \max[P(\xi > u), P(\xi < -u)] \le \exp(-u^2/K^2), \quad u \ge 0,
$$

then $\xi \in \mathrm{Sub}(\Omega)$ and $\|\xi\|_{\mathrm{Sub}} < 4K$.

The subgaussian norm on the subspace of the centered r.v. is equivalent to the
following Grand Lebesgue Space (GLS) norm:

$$
|||\xi||| := \sup_{s \ge 1} \left[ \frac{|\xi|_s}{\sqrt{s}} \right], \quad |\xi|_s = \left[ E|\xi|^s \right]^{1/s}.
$$

For a more detailed investigation of these spaces see the monograph [20], chapter 1.

In what follows we denote for brevity, for any r.v. $\eta$,

$$
\sigma^2(\eta) := \mathrm{Var}\, \eta = E\eta^2 - (E\eta)^2.
$$

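Definition 2.2. (See [8], chapter 1; [21].) The subgaussian r.v. $\xi$ is said to be
Strictly Subgaussian, write $\xi \in \mathrm{SSub}$, iff

$$
\forall \lambda \in R \ \Rightarrow \ E e^{\lambda \xi} \le e^{\sigma^2(\xi) \lambda^2/2}, \tag{2.4}
$$

or equivalently

$$
\|\xi\|_{\mathrm{Sub}} \le \sigma(\xi) = \|\xi\|_{L_2(\Omega)}. \tag{2.4a}
$$

Recall that always $\|\xi\|_{\mathrm{Sub}} \ge \sigma(\xi) = \|\xi\|_{L_2(\Omega)}$, so that

$$
\xi \in \mathrm{SSub}(\Omega) \ \Leftrightarrow \ E\xi = 0, \ \|\xi\|_{\mathrm{Sub}} = \sigma(\xi) = \|\xi\|_{L_2(\Omega)}. \tag{2.4b}
$$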

Many examples of strictly subgaussian distributions may be found in the book
of V.V.Buldygin and Yu.V.Kozachenko [8], chapter 1. For instance, an arbitrary
mean zero Gaussian distributed r.v. is strictly subgaussian, including the case
when this r.v. is equal to zero a.e.; the symmetric Rademacher r.v. $\rho$ with
distribution $P(\rho = 1) = P(\rho = -1) = 1/2$ belongs to the set $\mathrm{SSub}(\Omega)$.
The random variable $\eta$ which has the uniform distribution on the symmetric
interval $(-b, b)$, $b = \mathrm{const} \in (0, \infty)$, is Strictly Subgaussian.
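
The last two examples can be spot-checked numerically from their moment
generating functions, $E e^{\lambda \rho} = \cosh \lambda$ and
$E e^{\lambda \eta} = \sinh(\lambda b)/(\lambda b)$, with variances $1$ and $b^2/3$.
The sketch below is illustrative only: the helper names are hypothetical, and the
check runs over a finite grid of $\lambda$, so it is not a proof.

```python
from math import cosh, sinh, exp

def rademacher_mgf(lam):
    # E exp(lam * rho) for P(rho = 1) = P(rho = -1) = 1/2.
    return cosh(lam)

def uniform_mgf(lam, b):
    # E exp(lam * eta) for eta uniform on (-b, b); the value at lam = 0 is 1.
    t = lam * b
    return 1.0 if t == 0 else sinh(t) / t

def strictly_subgaussian_on_grid(mgf, variance, lams):
    # Checks E exp(lam * xi) <= exp(variance * lam^2 / 2) on the given grid only.
    return all(mgf(l) <= exp(variance * l * l / 2) * (1 + 1e-12) for l in lams)

lams = [k / 10 for k in range(-100, 101)]        # lam in [-10, 10]
b = 2.0                                           # assumed half-width of the interval
ok_rademacher = strictly_subgaussian_on_grid(rademacher_mgf, 1.0, lams)
ok_uniform = strictly_subgaussian_on_grid(lambda l: uniform_mgf(l, b), b * b / 3, lams)
```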

Consider also, following the authors of [8], the r.v. $\zeta$ with the following
density:

$$
f_{\zeta}(x) = \frac{\alpha + 1}{2\alpha} \, (1 - |x|^{\alpha}) \, I(|x| \le 1), \quad \alpha = \mathrm{const} > 0, \tag{2.5}
$$

where $I(A) = I(A, x) = 1$, $x \in A$; $I(A) = I(A, x) = 0$, $x \notin A$, is the indicator
function; then $\zeta \in \mathrm{SSub}(\Omega)$.

This example is interesting because the kurtosis of the r.v. $\zeta$ is zero if
$\alpha = \sqrt{10} - 3$.
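
The value $\alpha = \sqrt{10} - 3$ can be verified from the even moments of the
density $\frac{\alpha+1}{2\alpha}(1 - |x|^{\alpha})$ on $[-1, 1]$, which have the closed form
$E\zeta^k = (\alpha + 1)/((k+1)(k+\alpha+1))$ for even $k$. The sketch below assumes that
"kurtosis" means the excess kurtosis $E\zeta^4/(E\zeta^2)^2 - 3$; the function names
are hypothetical.

```python
from math import sqrt

def even_moment(alpha, k):
    # E zeta^k for even k under the density (alpha+1)/(2*alpha) * (1 - |x|^alpha)
    # on [-1, 1]:  (alpha+1)/alpha * (1/(k+1) - 1/(k+alpha+1)).
    return (alpha + 1) / ((k + 1) * (k + alpha + 1))

def excess_kurtosis(alpha):
    m2 = even_moment(alpha, 2)
    m4 = even_moment(alpha, 4)
    return m4 / m2**2 - 3.0

alpha_star = sqrt(10) - 3
kurt = excess_kurtosis(alpha_star)   # vanishes up to rounding
```

Setting $E\zeta^4 = 3 (E\zeta^2)^2$ reduces to $\alpha^2 + 6\alpha - 1 = 0$, whose
positive root is $\sqrt{10} - 3$.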

The convenience of these notions is the following. Let $\{\xi(i)\}$, $i = 1, 2, \ldots, n$, be
(centered) independent subgaussian r.v. Denote

$$
S(n) = \sum_{i=1}^{n} \xi(i), \quad \Sigma^2(n) = \sum_{i=1}^{n} (\|\xi(i)\|_{\mathrm{Sub}})^2. \tag{2.6}
$$

Then $\|S(n)\|_{\mathrm{Sub}} \le \Sigma(n)$ and the following tail (concentration)
inequalities hold:

$$
\max(P(S(n)/\Sigma(n) > x), \ P(S(n)/\Sigma(n) < -x)) \le \exp(-x^2/2), \quad x \ge 0. \tag{2.7}
$$

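If in addition the r.v. $\xi(i)$ are identically distributed and
$\beta := \|\xi(1)\|_{\mathrm{Sub}} \in (0, \infty)$, then

$$
\sup_n \|S(n)/\sqrt{n}\|_{\mathrm{Sub}} = \beta
$$

and

$$
\sup_n \max(P(S(n)/(\beta \sqrt{n}) > x), \ P(S(n)/(\beta \sqrt{n}) < -x)) \le e^{-x^2/2}, \quad x \ge 0. \tag{2.8}
$$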
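
For Rademacher summands, for which $\beta = 1$, the left-hand side of (2.8) can be
computed exactly from the binomial distribution. The following sketch (hypothetical
function names, finite grid of $n$ and $x$ only) compares the exact upper tail with
$e^{-x^2/2}$.

```python
from math import comb, exp, sqrt

def rademacher_sum_upper_tail(n, x):
    """Exact P(S(n)/sqrt(n) > x) for a sum S(n) of n independent Rademacher r.v."""
    # S(n) = 2*K - n with K ~ Bin(n, 1/2).
    threshold = x * sqrt(n)
    return sum(comb(n, k) for k in range(n + 1) if 2 * k - n > threshold) / 2**n

def worst_ratio(ns, xs):
    # Largest value of tail / exp(-x^2/2) over the given finite grid.
    return max(rademacher_sum_upper_tail(n, x) / exp(-x * x / 2)
               for n in ns for x in xs)

# n = 1, ..., 60 and x = 0, 0.25, ..., 4; a ratio <= 1 agrees with (2.8).
ratio = worst_ratio(range(1, 61), [0.25 * j for j in range(17)])
```

By symmetry the lower tail has the same value, so the same ratio covers both
probabilities in (2.8).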

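If in addition the r.v. $\xi(i)$ are strictly subgaussian, the estimate (2.8) may be
complemented by a lower estimate based on the classical CLT:

$$
\sup_n P(S(n)/(\beta \sqrt{n}) > x) \ge \lim_{n \to \infty} P(S(n)/(\beta \sqrt{n}) > x) =
(2\pi)^{-1/2} \int_x^{\infty} e^{-y^2/2} \, dy \ge C \, x^{-1} e^{-x^2/2}, \quad x \ge 1. \tag{2.9}
$$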

Definition 2.3. The non-degenerate centered random variable $\nu$ with variance
$\sigma^2$, $0 < \sigma < \infty$, is said to be almost strictly subgaussian, briefly
$\nu \in \mathrm{ASSub}$, if for all the real values $\lambda \in R$

$$
E \cosh(\lambda \nu/\sigma) \le \exp(\lambda^2/2). \tag{2.10}
$$


Proposition 2.1.

The definition (2.10) immediately implies, by means of the Chebyshev-Chernoff
inequality, the following tail estimate for these variables:

$$
T_{\nu}(u) \le 2 \exp(-u^2/(2\sigma^2)), \quad u \ge 0. \tag{2.11}
$$

Proof. Obviously, if the r.v. $\nu$ is almost strictly subgaussian, $\nu \in \mathrm{ASSub}$, then
$c \cdot \nu \in \mathrm{ASSub}$, $c = \mathrm{const}$.

Let now $\nu \in \mathrm{ASSub}$ with $\sigma = 1$. We have by means of Chebyshev's inequality,
for positive values $\lambda$ and $u$:

$$
T_{\nu}(u) \le \frac{e^{\lambda^2/2}}{\cosh(\lambda u)} = \frac{e^{\lambda^2/2}}{0.5 (e^{\lambda u} + e^{-\lambda u})} \le
2 \exp(\lambda^2/2 - \lambda u) = 2 \exp(-u^2/2),
$$

if we choose $\lambda = u$.

Note that an analogous approach first appeared in the article [22], where it was
applied to the investigation of the Central Limit Theorem in Banach space.

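Denote also

$$
\theta = \theta(p) = \sqrt{p \, (1 - p)}, \quad 0 \le p \le 1. \tag{2.12}
$$

Let us now formulate and prove the main result of this section. Denote by
$\mu = \mu_n = \mu_{n,p}$ the r.v. having the Bernoulli (binomial) distribution:
$\mathrm{Law}\, \mu = \mathrm{Bin}(n, p)$, $n = 1, 2, \ldots$; $0 < p < 1$. Recall that

$$
P(\mu = m) = \binom{n}{m} p^m (1-p)^{n-m}, \quad m = 0, 1, 2, \ldots, n,
$$

and

$$
E\mu = np, \quad \mathrm{Var}\, \mu = np(1 - p) = n \, \theta^2(p).
$$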

Remark 2.1. A non-centered random variable $\zeta$ may be called Almost
Strictly Subgaussian iff it has a first moment and the centered r.v. $\zeta - E\zeta$
satisfies the definition 2.3.

Theorem 2.1 (in our definitions and notations).

The random variables $\mu - np$, $0 < p < 1$, $n = 1, 2, \ldots$, are almost strictly
subgaussian uniformly for all the values $n$.


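Proof. The inequality

$$
E(\mu - np)^{2m} \le \frac{(2m)!}{2^m \, m!} \, n^m \, \theta^{2m}(p), \quad m = 0, 1, 2, \ldots \tag{2.13}
$$

is proved in the book of G.G.Lorentz [18], page 14. Denoting the centered
and normed r.v.

$$
\rho := \frac{\mu - np}{\sqrt{np(1-p)}},
$$

we get for all the real values $\lambda \in R$:

$$
E \cosh(\lambda \rho) = \sum_{m=0}^{\infty} \frac{\lambda^{2m} E\rho^{2m}}{(2m)!} \le
\sum_{m=0}^{\infty} \frac{\lambda^{2m}}{(2m)!} \cdot \frac{(2m)!}{2^m \, m!} =
\sum_{m=0}^{\infty} \frac{\lambda^{2m}}{2^m \, m!} = \exp(\lambda^2/2). \tag{2.14}
$$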


Remark 2.2. The inequality

$$
\ln \left[ (1 - p) e^{-\lambda p} + p \, e^{\lambda (1-p)} \right] \le p(1-p) \cdot \frac{\lambda^2}{2}, \tag{2.15}
$$

where $1/2 \le p \le 1$ and $\lambda \ge 0$, was first proved in the article of D.Berend and
A.Kontorovich [2], lemma 5, page 4; see also some applications in [28] and
preliminary results in [7], [8], [15].
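
Inequality (2.15) is a one-sided bound on the logarithm of the moment generating
function of a centered Bernoulli variable. The sketch below, with hypothetical
function names, evaluates the gap between its two sides on a finite grid of
$p \in [1/2, 1)$ and $\lambda \in [0, 20]$; it is a numerical illustration, not a proof.

```python
from math import exp, log

def bernoulli_log_mgf(p, lam):
    # ln E exp(lam * (xi - p)) for P(xi = 1) = p, P(xi = 0) = 1 - p.
    return log((1 - p) * exp(-lam * p) + p * exp(lam * (1 - p)))

def gap(p, lam):
    # Right-hand side minus left-hand side of (2.15); non-negative when it holds.
    return p * (1 - p) * lam * lam / 2 - bernoulli_log_mgf(p, lam)

ps = [0.5 + 0.01 * i for i in range(50)]       # 0.50, 0.51, ..., 0.99
lams = [0.1 * j for j in range(201)]           # 0.0, 0.1, ..., 20.0
min_gap = min(gap(p, lam) for p in ps for lam in lams)
```

The restriction to $p \ge 1/2$ and $\lambda \ge 0$ matters: for small $p$ and $\lambda > 0$
the gap becomes negative, for example at $p = 0.1$, $\lambda = 3$.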

Perhaps this form of the probabilistic inequality was known to specialists
in approximation theory, see e.g. [13], [11], chapter 10, [23], [24].


3 Main result. Exactness. 

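Let us introduce the following (sublinear) operator (more precisely, the sequence of
operators)

$$
J_n[f](x) := \int_0^{\infty} \omega[f]\left( \frac{z \cdot \theta(x)}{\sqrt{n}} \right) z \exp\left( -\frac{z^2}{2} \right) dz.
$$

Denote also

$$
W = W(x) := \sup_{f \ne \mathrm{const}, \ f \in C[0,1]} \ \sup_{n = 1, 2, \ldots} \frac{\Delta_n[f](x)}{J_n[f](x)}. \tag{3.0}
$$

Theorem 3.1. We assert that for all the values $x \in (0, 1)$

$$
\frac{1}{\pi} \le W = W(x) \le 2. \tag{3.1}
$$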


Proof. Upper bound.

Let $x \in (0, 1)$; both the degenerate cases $x = 0$ and $x = 1$ are trivial.

The expression for $B_n[f](x)$ may be represented, as in the initial work of
S.N.Bernstein [3], as follows:

$$
B_n[f](x) = E f(\mu/n), \tag{3.2}
$$

where the r.v. $\mu$ has the binomial (Bernoulli) distribution with parameters
$E\mu = nx$, $\mathrm{Var}\, \mu = nx(1 - x)$. The r.v. $\mu - nx$ is equal to the sum

$$
\mu - nx = \sum_{j=1}^{n} \xi_x(j); \quad \mu/n - x = \sum_{j=1}^{n} \xi_x(j)/n, \tag{3.3}
$$

where the r.v. $\xi_x$ has the binary distribution

$$
P(\xi_x = 1 - x) = x = 1 - P(\xi_x = -x), \quad x \in (0, 1),
$$

and the random variables $\xi_x(j)$ are independent copies of the r.v. $\xi_x$.
Put

$$
\zeta = \zeta_{n,x} = \sum_{j=1}^{n} \xi_x(j)/\sqrt{n}, \tag{3.4}
$$

then $\mu/n = x + \zeta_{n,x}/\sqrt{n}$. As we know, a linear combination of independent
strictly subgaussian random variables is also strictly subgaussian, and we conclude
therefore on the basis of theorem 2.1 that

$$
T_{\zeta_{n,x}}(u) \le 2 \exp\left( -u^2/(2\theta^2(x)) \right). \tag{3.5}
$$

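We can and will suppose without loss of generality that the r.v. $\zeta_{n,x}$ has a
symmetric distribution such that

$$
T_{\zeta_{n,x}}(u) = 2 \exp\left( -u^2/(2\theta^2(x)) \right). \tag{3.5a}
$$

More precisely, let $\tilde{\zeta} = \tilde{\zeta}_{n,x}$ be a symmetrically distributed r.v. such that

$$
T_{\tilde{\zeta}_{n,x}}(u) = 2 \exp\left( -u^2/(2\theta^2(x)) \right), \tag{3.5b}
$$

then for any positive monotonically increasing function $G = G(z)$, $z \ge 0$,

$$
E G(|\zeta_{n,x}|) \le E G(|\tilde{\zeta}_{n,x}|).
$$

We estimate consecutively

$$
\Delta_n[f](x) = | B_n[f](x) - f(x) | = \left| E\left( f(x + \zeta/\sqrt{n}) - f(x) \right) \right| \le E \, \omega[f]\left( \frac{|\zeta|}{\sqrt{n}} \right),
$$

and we deduce after integration by parts, relying on the inequality (3.5),

$$
\Delta_n[f](x) \le 2 \int_0^{\infty} \omega[f]\left( \frac{y}{\sqrt{n}} \right) \frac{y}{\theta^2(x)} \exp\left( -y^2/(2\theta^2(x)) \right) dy =
2 \int_0^{\infty} \omega[f]\left( \frac{z \cdot \theta(x)}{\sqrt{n}} \right) z \exp\left( -z^2/2 \right) dz = 2 J_n[f](x). \tag{3.6}
$$

Thus, we proved the upper bound: $W \le 2$.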
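
The upper bound can be spot-checked for a single trial function. The sketch below
takes $f(t) = |1/2 - t|$ at the point $x = 1/2$, for which $\omega[f](\delta) = \min(\delta, 1/2)$,
computes $\Delta_n[f](x) = E|\mu/n - x|$ exactly from the binomial distribution, and
evaluates $J_n[f](x)$ with a midpoint rule truncated at an assumed cutoff `z_max`.
The function names and numerical parameters are hypothetical, and one instance is
of course no substitute for the proof.

```python
from math import comb, exp, sqrt, pi

def delta_abs(n, x):
    """Delta_n[g_x](x) = E|mu/n - x| for g_x(t) = |x - t| and mu ~ Bin(n, x)."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * abs(k / n - x)
               for k in range(n + 1))

def J(omega, n, x, z_max=12.0, steps=24000):
    """Midpoint-rule value of int_0^inf omega(z*theta(x)/sqrt(n)) z exp(-z^2/2) dz."""
    theta = sqrt(x * (1 - x))
    h = z_max / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += omega(z * theta / sqrt(n)) * z * exp(-z * z / 2)
    return total * h

n, x = 20, 0.5
omega_g = lambda d: min(d, 0.5)      # modulus of continuity of |1/2 - t| on [0, 1]
lhs = delta_abs(n, x)                # Delta_n[f](x)
rhs = 2 * J(omega_g, n, x)           # 2 * J_n[f](x)
```

For the uncapped linear modulus $\omega(\delta) = \delta$ the integral has the closed form
$\theta(x) \sqrt{\pi/2} / \sqrt{n}$, which gives a convenient accuracy check for the
quadrature.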

Remark 3.1. Note that $\theta(x) \le 1/2$, therefore

$$
\Delta_n[f](x) \le 2 \int_0^{\infty} \omega[f]\left( \frac{y}{2\sqrt{n}} \right) y \exp\left( -y^2/2 \right) dy. \tag{3.6a}
$$

Thus, we easily obtained the uniform Bernstein error estimate from the
non-uniform one.

Remark 3.2. It follows immediately from the estimate (3.6a), on the basis
of the Lebesgue dominated convergence theorem, that

$$
\lim_{n \to \infty} \max_{x \in [0,1]} \Delta_n[f](x) = 0,
$$

i.e. the classical Bernstein result.

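Examples 3.1.

Let the function $f = f(x)$ be Hölder continuous:

$$
\exists \alpha = \mathrm{const} \in (0, 1], \ H_{\alpha} = \mathrm{const} < \infty: \quad
|f(x_1) - f(x_2)| \le H_{\alpha} \, |x_1 - x_2|^{\alpha}, \quad x_1, x_2 \in [0, 1]; \tag{3.7}
$$

or equivalently

$$
\omega[f](\delta) \le H_{\alpha} \, \delta^{\alpha}, \quad H_{\alpha} < \infty. \tag{3.7a}
$$

It follows immediately from theorem 3.1 that

$$
\Delta_n[f](x) \le 2 \, \alpha \, \Gamma(\alpha/2) \, H_{\alpha} \left[ \frac{2x(1 - x)}{n} \right]^{\alpha/2}. \tag{3.8}
$$

If in addition $\alpha = 1$ (Lipschitz condition) and if we denote as usual
$L = H_1 \in (0, \infty)$, then

$$
\Delta_n[f](x) \le 2L \left[ \frac{2\pi x(1 - x)}{n} \right]^{1/2}. \tag{3.8a}
$$

See also [19], [25].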

Lower bound.

Ranko Bojanic in the article [4], see also [5], [31], [30], considered the family of
trial functions of the form

$$
g_t(x) = |t - x|, \quad t, x \in [0, 1],
$$

which are Lipschitz in the variable $x$ (as well as in the variable $t$) with
constant $L = 1$, and proved in fact that for $t = x$, as $n \to \infty$,

$$
\Delta_n[g_t](x) = \left[ \frac{2x(1 - x)}{\pi n} \right]^{1/2} + O\left( \frac{1}{n} \right). \tag{3.9}
$$

We conclude therefore that for all the values $x$ from the open interval $(0, 1)$

$$
W(x) \ge \lim_{n \to \infty} \frac{\Delta_n[g_t](x)}{J_n[g_t](x)} = 1/\pi. \tag{3.10}
$$

This completes the proof of theorem 3.1.


4 Convergence of derivatives of Bernstein’s polynomials.


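Let now in this section the aforementioned function $f$ be continuously differentiable:
$f \in C^1[0, 1]$. It is known that the derivatives $B_n'[f](x)$ converge uniformly to
the derivative $f'(x)$, see [12], [33].

We intend to obtain here a refined estimate, also non-asymptotic and
non-uniform, for the approximation error

$$
\Delta_n'[f](x) := | B_n'[f](x) - f'(x) |. \tag{4.1}
$$

Theorem 4.1.

$$
\Delta_n'[f](x) \le \omega[f']\left( \frac{1}{n} \right) + 2 J_{n-1}[f'](x) =
\omega[f']\left( \frac{1}{n} \right) + 2 \int_0^{\infty} \omega[f']\left( \frac{z \cdot \theta(x)}{\sqrt{n-1}} \right) z \exp\left( -\frac{z^2}{2} \right) dz, \quad n \ge 2. \tag{4.2}
$$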

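Proof.

Let $n \ge 2$. The expression for $B_n'[f](x)$ is given, for example, in the article [33]:

$$
B_n'[f](x) = n \sum_{j=0}^{n-1} \nabla_{1/n}[f]\left( \frac{j}{n} \right) \binom{n-1}{j} x^j (1 - x)^{n-1-j}, \tag{4.3}
$$

where

$$
\nabla_{1/n}[f]\left( \frac{j}{n} \right) = f\left( \frac{j+1}{n} \right) - f\left( \frac{j}{n} \right)
$$

is the difference operator for the function $f$ with step $1/n$ at the point $j/n$.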
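
Formula (4.3) is easy to test numerically against a finite-difference derivative
of $B_n[f]$. The following sketch uses hypothetical function names and an assumed
step `h` for the central difference.

```python
from math import comb

def bernstein(f, n, x):
    return sum(comb(n, k) * f(k / n) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def bernstein_derivative(f, n, x):
    """B'_n[f](x) via forward differences of f with step 1/n, as in (4.3)."""
    return n * sum((f((j + 1) / n) - f(j / n))
                   * comb(n - 1, j) * x**j * (1 - x)**(n - 1 - j)
                   for j in range(n))

f = lambda t: t**3
n, x, h = 12, 0.37, 1e-6
exact = bernstein_derivative(f, n, x)
# Central difference of the polynomial B_n[f] itself, for comparison only.
numeric = (bernstein(f, n, x + h) - bernstein(f, n, x - h)) / (2 * h)
```

For $f(t) = t^2$ the formula gives $B_n'[f](x) = 2x + (1 - 2x)/n$, which agrees with
differentiating $x^2 + x(1-x)/n$.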

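Note that for $j = 0, 1, \ldots, n-1$, by the mean value theorem,
$n \nabla_{1/n}[f](j/n) = f'(\xi_j)$ for some $\xi_j \in [j/n, (j+1)/n]$, and the point
$j/(n-1)$ lies in the same interval; hence

$$
\left| n \nabla_{1/n}[f]\left( \frac{j}{n} \right) - f'\left( \frac{j}{n-1} \right) \right| \le \omega[f']\left( \frac{1}{n} \right). \tag{4.4}
$$

Further, since $n \ge 2$,

$$
B_n'[f](x) - f'(x) = \sum_{j=0}^{n-1} \left[ n \nabla_{1/n}[f]\left( \frac{j}{n} \right) - f'\left( \frac{j}{n-1} \right) \right] \binom{n-1}{j} x^j (1-x)^{n-1-j} + B_{n-1}[f'](x) - f'(x),
$$

therefore (see theorem 3.1)

$$
\Delta_n'[f](x) \le \omega[f']\left( \frac{1}{n} \right) + | B_{n-1}[f'](x) - f'(x) | \le
\omega[f']\left( \frac{1}{n} \right) + 2 J_{n-1}[f'](x), \tag{4.5}
$$

Q.E.D.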

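Remark 4.1.

As for the lower estimate for the Bernstein derivative, consider the trial
functions of the form

$$
G_t = G_t(x) := \int_0^x g_t(y) \, dy, \tag{4.6}
$$

then for all the values $t \in [0, 1]$

$$
\Delta_n'[G_t](x) \ge \int_0^{\infty} \omega[G_t']\left( \frac{z \cdot \theta(x)}{\sqrt{n-1}} \right) z \exp(-z^2/2) \, dz. \tag{4.7}
$$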


5 Multivariate Bernstein’s polynomials. 

The speed of convergence. 

Let now $f = f(x, y)$, $(x, y) \in [0, 1]^2$, be a continuous numerical function defined
on the square $[0, 1]^2$, and let $n_1, n_2 \ge 2$ be integers.

One can define the modulus of continuity of the multivariate function $f = f(x, y)$
as follows:

$$
\omega[f](\delta_1, \delta_2) := \sup_{|x_1 - x_2| \le \delta_1} \ \sup_{|y_1 - y_2| \le \delta_2} |f(x_1, y_1) - f(x_2, y_2)|, \quad \delta_1, \delta_2 \in [0, 1]. \tag{5.0}
$$

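The multivariate (more exactly, bivariate) Bernstein polynomial $B_{n_1,n_2}[f](x, y)$
of order $(n_1, n_2)$ can be defined by the formula

$$
B_{n_1,n_2}[f](x, y) = \sum_{k_1=0}^{n_1} \sum_{k_2=0}^{n_2} \binom{n_1}{k_1} \binom{n_2}{k_2} f\left( \frac{k_1}{n_1}, \frac{k_2}{n_2} \right)
x^{k_1} (1 - x)^{n_1 - k_1} \, y^{k_2} (1 - y)^{n_2 - k_2}, \tag{5.1}
$$

see, e.g. [18], [33].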
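
Formula (5.1) is a tensor product of two univariate Bernstein bases, which can be
coded directly. The sketch below uses hypothetical function names; it also checks
numerically that a factorable function $g(x) h(y)$ is mapped to the product
$B_{n_1}[g](x) \cdot B_{n_2}[h](y)$, with arbitrarily chosen $g$ and $h$.

```python
from math import comb

def basis(n, k, x):
    # Bernstein basis polynomial C(n, k) x^k (1 - x)^(n - k).
    return comb(n, k) * x**k * (1 - x)**(n - k)

def bernstein1(g, n, x):
    """Univariate Bernstein polynomial B_n[g](x)."""
    return sum(basis(n, k, x) * g(k / n) for k in range(n + 1))

def bernstein2(f, n1, n2, x, y):
    """Bivariate Bernstein polynomial B_{n1,n2}[f](x, y) on the unit square."""
    return sum(basis(n1, k1, x) * basis(n2, k2, y) * f(k1 / n1, k2 / n2)
               for k1 in range(n1 + 1) for k2 in range(n2 + 1))

g = lambda s: abs(0.4 - s)           # assumed trial function in x
h = lambda t: 1.0 + t * t            # assumed continuous factor in y
f0 = lambda s, t: g(s) * h(t)
lhs = bernstein2(f0, 8, 5, 0.3, 0.7)
rhs = bernstein1(g, 8, 0.3) * bernstein1(h, 5, 0.7)
```

Since each univariate operator reproduces linear functions, the bilinear function
$f(x, y) = xy$ is reproduced exactly, which gives another quick check.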

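We define formally, by analogy with the one-dimensional case,

$$
B_{n_1,\infty}[f](x, y) = \sum_{k_1=0}^{n_1} \binom{n_1}{k_1} f\left( \frac{k_1}{n_1}, y \right) x^{k_1} (1 - x)^{n_1 - k_1} \tag{5.1a}
$$

and analogously

$$
B_{\infty,n_2}[f](x, y) = \sum_{k_2=0}^{n_2} \binom{n_2}{k_2} f\left( x, \frac{k_2}{n_2} \right) y^{k_2} (1 - y)^{n_2 - k_2}. \tag{5.1b}
$$

Finally,

$$
B_{\infty,\infty}[f](x, y) = f(x, y), \quad f(\cdot, \cdot) \in C([0, 1]^2).
$$

Define also the two-dimensional non-uniform error of the Bernstein approximation

$$
\Delta_{n_1,n_2}[f](x, y) := | B_{n_1,n_2}[f](x, y) - f(x, y) |. \tag{5.2}
$$

Introduce, analogously to the one-dimensional case, the following operator (more
precisely, the sequence of operators)

$$
J_{n_1,n_2}[f](x, y) := \int_0^{\infty} \int_0^{\infty} \omega[f]\left( \frac{z_1 \cdot \theta(x)}{\sqrt{n_1}}, \frac{z_2 \cdot \theta(y)}{\sqrt{n_2}} \right)
z_1 z_2 \exp\left( -\frac{z_1^2}{2} - \frac{z_2^2}{2} \right) dz_1 \, dz_2. \tag{5.3}
$$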

Theorem 5.1. We assert as before that for all the values $x, y \in [0, 1]$ and
$d = 2$

$$
\pi^{-1} \le W_2 = W_2(x, y) := \sup_{f \ne \mathrm{const}, \ f \in C[0,1]^2} \ \sup_{n_1, n_2} \frac{\Delta_{n_1,n_2}[f](x, y)}{J_{n_1,n_2}[f](x, y)} \le 2^d. \tag{5.4}
$$

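Proof.

Our reasoning is similar to that of section 3. Namely, let $0 < x, y < 1$; we
have

$$
B_{n_1,n_2}[f](x, y) = E f\left( \frac{\mu_1}{n_1}, \frac{\mu_2}{n_2} \right), \tag{5.5}
$$

where the independent random variables $\mu_1$ and $\mu_2$ have the integer binomial
(Bernoulli) distributions with the parameters, correspondingly,

$$
E\mu_1 = n_1 x, \quad E\mu_2 = n_2 y, \quad \mathrm{Var}\, \mu_1 = n_1 x(1 - x), \quad \mathrm{Var}\, \mu_2 = n_2 y(1 - y). \tag{5.6}
$$

Each such variable may be represented as above:

$$
\frac{\mu_1}{n_1} = x + \frac{\zeta_{1,x}}{\sqrt{n_1}}, \quad \frac{\mu_2}{n_2} = y + \frac{\zeta_{2,y}}{\sqrt{n_2}}, \tag{5.7}
$$

where the r.v. $\zeta_{1,x}$ and $\zeta_{2,y}$ are independent and are, as above, almost
strictly subgaussian.

We can and will suppose without loss of generality that the r.v. $\zeta_{1,x}$, $\zeta_{2,y}$
have symmetric distributions such that

$$
T_{\zeta_{1,x}}(u_1) = \exp\left( -u_1^2/(2\theta^2(x)) \right), \tag{5.8a}
$$

$$
T_{\zeta_{2,y}}(u_2) = \exp\left( -u_2^2/(2\theta^2(y)) \right) \tag{5.8b}
$$

and are independent.

It is not hard to calculate as above the densities of both the independent
variables $\zeta_{1,x}$, $\zeta_{2,y}$.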

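We estimate consecutively

$$
\Delta_{n_1,n_2}[f](x, y) = | B_{n_1,n_2}[f](x, y) - f(x, y) | =
\left| E\left( f(x + \zeta_{1,x}/\sqrt{n_1}, \ y + \zeta_{2,y}/\sqrt{n_2}) - f(x, y) \right) \right| \le
E \, \omega[f]\left( \frac{|\zeta_{1,x}|}{\sqrt{n_1}}, \frac{|\zeta_{2,y}|}{\sqrt{n_2}} \right),
$$

and we deduce after integrating by parts twice, analogously to the proof of the
inequality (3.6),

$$
\Delta_{n_1,n_2}[f](x, y) \le 2^d \int_0^{\infty} \int_0^{\infty} \omega[f]\left( \frac{v_1}{\sqrt{n_1}}, \frac{v_2}{\sqrt{n_2}} \right)
\frac{v_1 v_2}{\theta^2(x) \, \theta^2(y)} \exp\left( -\frac{v_1^2}{2\theta^2(x)} - \frac{v_2^2}{2\theta^2(y)} \right) dv_1 \, dv_2 =
$$

$$
2^d \int_0^{\infty} \int_0^{\infty} \omega[f]\left( \frac{z_1 \cdot \theta(x)}{\sqrt{n_1}}, \frac{z_2 \cdot \theta(y)}{\sqrt{n_2}} \right)
z_1 z_2 \exp\left( -\frac{z_1^2}{2} - \frac{z_2^2}{2} \right) dz_1 \, dz_2 = 2^d J_{n_1,n_2}[f](x, y). \tag{5.9}
$$

Thus, we proved the upper bound: $W_2 \le 2^d$.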

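To derive the lower bound, it is sufficient to consider as an example a
factorable function, namely

$$
f_0(x, y) = g_{t_1}(x) \cdot h(y), \quad t_1 \in (0, 1), \quad h(\cdot) \in C[0, 1],
$$

so that

$$
B_{n_1,n_2}[f_0](x, y) = B_{n_1}[g_{t_1}](x) \cdot B_{n_2}[h](y);
$$

and finally

$$
\omega[f_0](\delta_1, \delta_2) = \omega[g_{t_1}](\delta_1) \cdot \omega[h](\delta_2), \quad
J_{n_1,n_2}[f_0](x, y) = J_{n_1}[g_{t_1}](x) \cdot J_{n_2}[h](y).
$$

Let us choose, for example, $h(y) = \mathrm{const} = 1$. It follows immediately from the
second proposition of theorem 3.1 that

$$
W_2(x, y) \ge \lim_{n_1 \to \infty} \frac{\Delta_{n_1,\infty}[f_0](x, y)}{J_{n_1,\infty}[f_0](x, y)} =
\lim_{n_1 \to \infty} \frac{\Delta_{n_1}[g_{t_1}](x)}{J_{n_1}[g_{t_1}](x)} = 1/\pi, \tag{5.10}
$$

Q.E.D.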


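More general approach.

Let $\|(x, y)\|$ be any non-degenerate norm on the plane, for instance

$$
\|(x, y)\| = \sqrt{x^2 + y^2}, \quad \|(x, y)\| = \max(|x|, |y|), \quad \|(x, y)\| = |x| + |y| \tag{5.11}
$$

and so on.

In accordance with this definition, suppose that there exists a non-negative
continuous non-decreasing function $\gamma(\delta) = \gamma[f](\delta)$, $\delta \ge 0$, such that
$\gamma(0+) = 0$ and

$$
|f(x_1, y_1) - f(x_2, y_2)| \le \gamma[f]\left( \|(x_1 - x_2, \ y_1 - y_2)\| \right).
$$

We deduce as above

$$
\Delta_{n_1,n_2}[f](x, y) \le 2^d \int_0^{\infty} \int_0^{\infty} \gamma[f]\left( \left\| \left( \frac{v_1}{\sqrt{n_1}}, \frac{v_2}{\sqrt{n_2}} \right) \right\| \right)
\frac{v_1 v_2}{\theta^2(x) \, \theta^2(y)} \exp\left( -\frac{v_1^2}{2\theta^2(x)} - \frac{v_2^2}{2\theta^2(y)} \right) dv_1 \, dv_2. \tag{5.12}
$$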


6 Concluding remarks. 


It is not hard to generalize the results of the last section to the "more"
multivariate case $d = 3, 4, 5, \ldots$, as well as to other methods of approximation,
provided that they have a probabilistic representation.

One can also investigate and improve the rate of convergence of the partial
derivatives of the multivariate Bernstein polynomials.